Abstract

The Quick Medical Reference (QMR) is a compendium of statistical knowledge con- 
necting diseases to findings (symptoms). The information in QMR can be represented 
as a Bayesian network. The inference problem (or, in more medical language, giving a 
diagnosis) for the QMR is to, given some findings, find the probability of each disease. 
Rejection sampling and likelihood weighted sampling (a.k.a. likelihood weighting) are 
two simple algorithms for making approximate inferences from an arbitrary Bayesian 
net (and from the QMR Bayesian net in particular). Heretofore, the samples for these 
two algorithms have been obtained with a conventional "classical computer". In this
paper, we will show that two analogous algorithms exist for the QMR Bayesian net, 
where the samples are obtained with a quantum computer. We expect that these two 
algorithms, implemented on a quantum computer, can also be used to make inferences 
(and predictions) with other Bayesian nets. 



1 Introduction 

Trying to make inferences based on incomplete, uncertain knowledge is a common 
everyday problem. Computer scientists have found that this problem can be handled 
admirably well using Bayesian networks (a.k.a. causal probabilistic networks) [1].
Bayesian nets allow one to pose and solve the inference problem in a graphical fashion
that possesses a high degree of intuitiveness, naturalness, consistency, reusability, 
modularity, generality and simplicity. 

This paper was motivated by a series of papers written by me, in which I 
define some nets that describe quantum phenomena. I call them "quantum Bayesian 
nets" (QB nets). They are a counterpart to the conventional "classical Bayesian nets" 
(CB nets) that describe classical phenomena. In particular, this paper gives an
example of a general technique, first proposed in Ref. [2], of embedding CB nets within
QB nets. The reader can understand this paper without first reading Ref. [2], but may
consult Ref. [2] to better understand the motivation behind the constructs used in
this paper and how they can be generalized.



Figure 1: Bayesian network of the same form as the QMR Bayesian network, but 
with considerably fewer parent ("diseases") and children ("findings") nodes. 

The Quick Medical Reference (QMR) is a compendium of statistical knowledge 
connecting diseases to symptoms. The original version of QMR was compiled by
Miller et al. [3]. Shwe et al. [4] designed a CB net based on the information of Ref. [3].
The QMR CB net of Shwe et al. is of the form shown in Fig. 1. It contains two layers: a
top layer of ~600 parent nodes corresponding to distinct diseases, and a bottom layer
of ~4,000 children nodes corresponding to distinct findings. The inference problem
(or, in more medical language, giving a diagnosis) for the QMR is to, given some 
findings, find the probability of each disease, or at least the more likely diseases. 

Making an inference with a CB net usually requires summing over the states
of a subset $S$ of the set of nodes of the graph. If each node in $S$ contains just 2 states,
a sum over all the states of $S$ is a sum over $2^{|S|}$ terms. These sums of exponential
size are the bane of the Bayesian network formalism. It has been shown that making
exact [5] (or even approximate [6]) inferences with a general CB net is NP-hard. In
1988, Lauritzen and Spiegelhalter (LS) devised a technique [7] for making inferences
with CB nets for which the subset $S$ is relatively small (for them, $S = S_{LS}$ = the
maximal clique of the moralized graph). This led to a resurgence in the use of CB
nets, as it allowed the use of nets that hitherto had been prohibitively expensive
computationally. According to Ref. [8], for the QMR CB net, $|S_{LS}| \approx 150$, so the
LS technique does not help in this case. Researchers have found (Ref. [8] gives a nice
review of their work) many exact and approximate algorithms for making inferences
from the QMR CB net. Still, all currently known algorithms require performing an
exponential number of operations.

Rejection sampling and likelihood weighted sampling (a.k.a. likelihood weight- 
ing) are two simple algorithms for making approximate inferences from an arbitrary 
CB net (and from the QMR CB net in particular). Heretofore, the samples for these 
two algorithms have been obtained with a conventional "classical computer". In this
paper, we will show that two analogous algorithms exist for the QMR CB net, where 
the samples are obtained with a quantum computer. We will show that obtaining 
each sample, for these two algorithms, for the QMR CB net, on a quantum computer, 
requires only a polynomial number of steps. We expect that these two algorithms, 
implemented on a quantum computer, can also be used to make inferences (and pre- 
dictions) with other CB nets. 

2 Notation 

In this section, we will define some notation that is used throughout this paper. For 
additional information about our notation, we recommend that the reader consult
Ref. [9]. Ref. [9] is a review article, written by the author of this paper, which uses the
same notation as this paper.

Let $Bool = \{0, 1\}$. As usual, let $\mathbb{Z}$, $\mathbb{R}$, $\mathbb{C}$ represent the set of integers (negative
and non-negative), real numbers, and complex numbers, respectively. For integers $a$,
$b$ such that $a \le b$, let $Z_{a,b} = \{a, a+1, \ldots, b-1, b\}$. For any set $S$, let $|S|$ be the number
of elements in $S$. The power set of $S$, i.e., the set of all subsets of $S$ (including the
empty and full sets), will be denoted by $2^S$. Note that $|2^S| = 2^{|S|}$.

We will use $\theta(S)$ to represent the "truth function"; $\theta(S)$ equals 1 if statement
$S$ is true and 0 if $S$ is false. For example, the Kronecker delta function is defined by
$\delta_x^y = \delta(x, y) = \theta(x = y)$.

Random variables will be represented by underlined letters. For any random
variable $\underline{x}$, $val(\underline{x})$ will denote the set of values that $\underline{x}$ can assume. Samples of $\underline{x}$ will
be denoted by $x^{(k)}$ for $k \in Z_{1,N_{sam}}$.

Consider an $n$-tuple $\vec f = (f_1, f_2, \ldots, f_n)$, and a set $A \subset Z_{1,n}$. By $(\vec f)_A$ we
will mean $(f_i)_{i \in A}$; that is, the $|A|$-tuple that one creates from $\vec f$ by keeping only
the components listed in $A$. If $\vec f \in Bool^n$, then we will use the statement $\vec f = 0$ to
indicate that all components of $\vec f$ are 0. Likewise, $\vec f = 1$ will mean all its components
are 1.

For any matrix $A \in \mathbb{C}^{p \times q}$, $A^*$ will stand for its complex conjugate, $A^T$ for its
transpose, and $A^\dagger$ for its Hermitian conjugate. When we write a matrix, and leave
some of its entries blank, those blank entries should be interpreted as zeros.

For any set $\Omega$ and any function $f : \Omega \to \mathbb{R}$, we will use $f(x) / (\sum_{x \in \Omega} numerator)$
to mean $f(x) / (\sum_{x \in \Omega} f(x))$. This notation is convenient when $f(x)$ is a long expression
that we do not wish to write twice.






Next we explain our notation for quantum circuit diagrams. We label single 
qubits (or qubit positions) by a Greek letter or by an integer. When we use integers, 
the topmost qubit wire is 0, the next one down is 1, then 2, etc. Note that in our 
quantum circuit diagrams, time flows from the right to the left of the diagram. Careful: 
Many workers in Quantum Computing draw their diagrams so that time flows in the 
opposite direction. We eschew their convention because it forces one to reverse the 
order of the operators every time one wishes to convert between a circuit diagram 
and its algebraic equivalent in Dirac notation. 

3 The QMR CB Net 

In this section, we describe the QMR CB net. 




Figure 2: (a) CB net with n parent nodes ("diseases"), all pointing into a single
child node ("finding"). (b) Noisy-or CB net, a special case or an approximation of
the CB net of Figure (a).

Before describing the QMR CB net, let us describe the noisy-or CB net
(invented by Pearl in Ref. [10]). Consider a CB net of the form of Fig. 2(a), consisting
of $n$ parent nodes ("diseases"), $d_j \in Bool$ with $j \in Z_{1,n}$, all pointing into a single
child node ("finding"), $f \in Bool$. The CB net of Fig. 2(a) represents a probability
distribution $P(f, \vec d)$ that satisfies:

$$P(f, \vec d) = P(f | \vec d) \prod_{j=1}^{n} P(d_j) . \tag{1}$$

We say the probability distribution of Eq. (1) and Fig. 2(a) is a noisy-or if it also
satisfies:

$$P(f | \vec d) = \sum_{\vec d'} P(f | \vec d') \prod_{j=1}^{n} P(d'_j | d_j) , \tag{2a}$$

with

$$P(f | \vec d') = \delta(f, d'_1 \vee d'_2 \vee \cdots \vee d'_n) . \tag{2b}$$

For example, when $n = 2$, Eq. (2b) reduces to $P(f | d'_1, d'_2) = \delta(f, d'_1 \vee d'_2)$.
Eqs. (2) are represented by Fig. 2(b). Sometimes, one also restricts the distributions
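$P(d'_j | d_j)$ to have the special form:

$$P(d'_j | d_j) = \begin{array}{c|cc} & d_j = 0 & d_j = 1 \\ \hline d'_j = 0 & 1 & 1 - q_{1j} \\ d'_j = 1 & 0 & q_{1j} \end{array} = (1 - q_{1j})^{d_j} \delta_{d'_j}^{0} + (q_{1j} d_j) \delta_{d'_j}^{1} , \tag{6}$$

where $q_{1j} \in [0, 1]$. A general distribution $P(d'_j | d_j)$ would contain 2 degrees of freedom
whereas Eq. (6) contains only one, namely $q_{1j}$. Note that

$$P(f = 0 | \vec d) = \prod_{j=1}^{n} (1 - q_{1j})^{d_j} = e^{-\sum_{j=1}^{n} \theta_{1j} d_j} , \tag{7}$$

where $\theta_{1j} = -\ln(1 - q_{1j}) \ge 0$. The inference problem for the noisy-or CB net consists
in calculating $P(\vec d | f = 0)$ and $P(\vec d | f = 1)$; that is, the probability of the diseases having
the value $\vec d$, given that $f$ is 0 or 1. This is given by Bayes rule: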



$$P(\vec d | f = 0) = \frac{P(f = 0 | \vec d) \prod_j P(d_j)}{P(f = 0)} \tag{8}$$

$$= \frac{\prod_j \{(1 - q_{1j})^{d_j} P(d_j)\}}{\sum_{\vec d} numerator} . \tag{9}$$

Note that the sum in the denominator of Eq. (9) is over $2^n$ terms.
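
To make the cost of Eq. (9) concrete, here is a minimal Python sketch that evaluates the posterior $P(\vec d | f = 0)$ of a noisy-or net by brute force. The function name `noisy_or_posterior_f0` and the numerical values of `q` and `prior` are hypothetical, chosen only for illustration; the loop visits all $2^n$ disease states, which is exactly why this approach does not scale.

```python
from itertools import product

def noisy_or_posterior_f0(q, prior):
    """Return {d: P(d | f=0)} by summing over all 2^n disease states.

    q[j]     : hypothetical activation probability q_{1j}
    prior[j] : hypothetical prior P(d_j = 1)
    """
    n = len(q)
    weights = {}
    for d in product((0, 1), repeat=n):
        w = 1.0
        for j in range(n):
            p_dj = prior[j] if d[j] == 1 else 1.0 - prior[j]
            w *= (1.0 - q[j]) ** d[j] * p_dj  # one factor of the numerator of Eq. (9)
        weights[d] = w
    total = sum(weights.values())  # P(f = 0), the 2^n-term denominator
    return {d: w / total for d, w in weights.items()}

posterior = noisy_or_posterior_f0(q=[0.8, 0.5], prior=[0.1, 0.2])
```

With these illustrative numbers the unnormalized weights are 0.72, 0.09, 0.016 and 0.002, so the denominator is 0.828.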
Now that we understand the noisy-or CB net, it's easy to understand the QMR
CB net. The QMR CB net consists of multiple noisy-or CB nets, one for each finding.
Suppose the QMR CB net has $N_D$ diseases (parent nodes), $d_j \in Bool$ for $j \in Z_{1,N_D}$,
and $N_F$ findings (children nodes), $f_i \in Bool$ for $i \in Z_{1,N_F}$. Then, for each $i \in Z_{1,N_F}$,
one has (see footnote 1)

$$P(f_i = 0 | (\vec d)_{pa(f_i)}) = \prod_{j \in pa(f_i)} \{(1 - q_{ij})^{d_j}\} = e^{-\sum_{j \in pa(f_i)} \theta_{ij} d_j} , \tag{10}$$

where $q_{ij} \in [0, 1]$ and $pa(f_i) \subset Z_{1,N_D}$ is the set of parents of node $f_i$. Let $I_0$, $I_1$ and $I_{unk}$
constitute a disjoint partition of $Z_{1,N_F}$. ("unk" stands for unknown.) The inference
problem for the QMR CB net consists in calculating $P[\vec d | (\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1]$. By
Bayes rule,

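$$P[\vec d | (\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1] = \frac{P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1 | \vec d] P(\vec d)}{P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1]} \tag{11}$$

$$= \frac{\Pi_0 \Pi_1 P(\vec d)}{P_{I_1,I_0}} , \tag{12}$$

where

$$\Pi_0 = \prod_{i \in I_0} \prod_{j \in pa(f_i)} \{(1 - q_{ij})^{d_j}\} , \tag{13}$$

$$\Pi_1 = \prod_{i \in I_1} \Big[ 1 - \prod_{j \in pa(f_i)} \{(1 - q_{ij})^{d_j}\} \Big] , \tag{14}$$

and

$$P_{I_1,I_0} = P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1] \tag{15}$$

$$= \sum_{\vec d} P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1, \vec d] . \tag{16}$$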

Note that the numerator of Eq. (12) can be calculated in a polynomial number
of steps, but its denominator (i.e., $P_{I_1,I_0}$) is expressed in Eq. (16) as a sum over
$2^{N_D}$ terms. Calculating $P_{I_1,I_0}$ naively, by summing numerically those $2^{N_D}$ terms, is
unfeasible when $N_D$ is large.

1 It's possible to include "leakage" in the definitions of noisy-or and QMR nets, but we won't
include it since it can be ignored without loss of generality. One can add a leakage node $L_i \in Bool$
pointing into each $f_i$ node, for each $i \in Z_{1,N_F}$. These leakage nodes behave just like disease nodes
that are always "turned on" (i.e., set to 1). Then, instead of Eq. (10), one has

$$P(f_i = 0 | L_i = 1, (\vec d)_{pa(f_i)}) = (1 - q_{i0}) \prod_{j \in pa(f_i)} \{(1 - q_{ij})^{d_j}\} = e^{-\theta_{i0} - \sum_{j \in pa(f_i)} \theta_{ij} d_j} .$$

At the end of this paper are 4 appendices. Reading them is not a prerequisite 
to understanding the rest of this paper, but they might be of interest to some readers. 

In Appendix A we show that

$$P_{I_1,I_0} = \sum_{S \subset I_1} (-1)^{|S|} T_{S,I_0} , \tag{17}$$

where $T : 2^{I_1} \times 2^{I_0} \to \mathbb{R}$ is some function that can be calculated in a polynomial
number of steps. Thus, $P_{I_1,I_0}$ can be calculated by summing numerically over $2^{|I_1|}$
terms, regardless of the size of $I_0$. This is better than $2^{N_D}$ terms, but still unfeasible for
large $|I_1|$.

Eq. (17) can be inverted. For more on this, see Appendix B.

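
The trade-off stated around Eq. (17) can be checked numerically on a toy net. The sketch below is an illustration under assumptions: the parameter layout `q[i][j]` (with `q[i][j] = 0` when disease `j` is not a parent of finding `i`), the function names, and the closed form used for `T` (a product over diseases obtained by summing each $d_j$ independently) are our own choices that are consistent with Eq. (17), not necessarily the exact definition used in Appendix A.

```python
from itertools import combinations, product

def brute_force_P(q, prior, I0, I1):
    """P[(f)_{I0}=0, (f)_{I1}=1] by summing over all 2^{N_D} disease states."""
    n_d = len(prior)
    total = 0.0
    for d in product((0, 1), repeat=n_d):
        p = 1.0
        for j in range(n_d):
            p *= prior[j] if d[j] else 1.0 - prior[j]
        for i in list(I0) + list(I1):
            p_off = 1.0
            for j in range(n_d):
                p_off *= (1.0 - q[i][j]) ** d[j]  # P(f_i = 0 | d), Eq. (10)
            p *= p_off if i in I0 else 1.0 - p_off
        total += p
    return total

def T(q, prior, S, I0):
    """Assumed closed form: sum_d P(d) prod_{i in I0 u S} P(f_i=0|d), factorized over diseases."""
    t = 1.0
    for j in range(len(prior)):
        off = 1.0
        for i in tuple(I0) + tuple(S):
            off *= 1.0 - q[i][j]
        t *= (1.0 - prior[j]) + prior[j] * off
    return t

def inclusion_exclusion_P(q, prior, I0, I1):
    """Right hand side of Eq. (17): a sum over the 2^{|I1|} subsets of I1."""
    total = 0.0
    for r in range(len(I1) + 1):
        for S in combinations(I1, r):
            total += (-1) ** r * T(q, prior, S, I0)
    return total

# Hypothetical toy net: 3 findings (rows), 3 diseases (columns).
Q = [[0.8, 0.0, 0.3], [0.5, 0.6, 0.0], [0.0, 0.7, 0.2]]
PRIOR = [0.1, 0.2, 0.3]
p_brute = brute_force_P(Q, PRIOR, [0], [1, 2])
p_incl_excl = inclusion_exclusion_P(Q, PRIOR, [0], [1, 2])
```

The two results agree up to rounding; the first costs $2^{N_D}$ terms and the second $2^{|I_1|}$ terms, each term being a product that is cheap to evaluate.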
Rejection sampling and likelihood weighted sampling are two algorithms for 
making approximate inferences from an arbitrary CB net (and from the QMR CB net 
in particular). Heretofore, the samples for these two algorithms have been obtained 
with a conventional "classical computer". In case the reader is not familiar with these
two algorithms, in the manner they have been implemented heretofore on a classical
computer, see Appendices C and D for an introduction to them. In the next section,
we will show that two analogous algorithms exist for the QMR CB net, where the 
samples are obtained with a quantum computer. 

4 Diagnosis Via Quantum Computer 

In this section, we will describe a method for making inferences from the QMR using 
a quantum computer. 

A slight change of notation: the parameter $q_{ij} \in [0, 1]$ of the previous section
will be replaced in this section by a sine squared. Let

$$q_{ij} = \sin^2 \alpha_{ij} = S_{ij}^2 , \quad 1 - q_{ij} = \cos^2 \alpha_{ij} = C_{ij}^2 , \tag{18}$$

for some real number $\alpha_{ij}$. We will also abbreviate $S_{1j}$ by $S_j$ and $C_{1j}$ by $C_j$.
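
As a small worked instance of Eq. (18), the following Python sketch converts a probability $q$ into one angle $\alpha$ that satisfies $q = \sin^2 \alpha$. The helper name `angle_from_q` and the value `0.25` are hypothetical; any angle with the same sine squared would serve equally well.

```python
import math

def angle_from_q(q):
    """Return one angle alpha in [0, pi/2] with sin(alpha)**2 == q."""
    return math.asin(math.sqrt(q))

alpha = angle_from_q(0.25)             # q = 1/4 corresponds to alpha = pi/6
S, C = math.sin(alpha), math.cos(alpha)
```

Here `S * S` recovers $q$ and `C * C` recovers $1 - q$, up to rounding.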

We begin by considering the simple case of a CB net consisting of two diseases
pointing to one finding, as displayed in Fig. 3(a). We will next show that Fig. 3(b)
is a quantum circuit that can generate some of the same probability distributions as
the CB net of Fig. 3(a). The state vectors $|\psi_1\rangle, |\psi_2\rangle$ and the unitary transformations
$A_1$, $A_2$, $A_{OR}$ that appear in the quantum circuit of Fig. 3(b) are defined as follows.

For $j = 1, 2$, define $|\psi_j\rangle$ by

$$|\psi_j\rangle = U_j |0\rangle , \tag{19}$$

where 
Figure 3: (a) CB net consisting of two diseases pointing to one finding. (b) Quantum
circuit that generates some of the same probability distributions as the CB net of
Figure (a).



U; 



For j = 1, 2, let 
(20)
{Cf)5%5% + (S^sf si 4 + [...] 4,



(21) 



(22) 



For those familiar with Ref. [2], note that the probability amplitude $A_j(d'_j, \tilde d_j | \tilde d'_j, d_j)$
is a q-embedding of the probability distribution $P(d'_j | d_j)$ defined in Eq. (6). Note also
that source and sink nodes are denoted by letters with tildes over them.

The matrix given by Eq. (21) is a 2-qubit unitary transformation. Such
transformations can be decomposed (compiled) into an expression containing at most 3
CNOTs, using a method due to Vidal and Dawson [11]. (For software that performs
this decomposition, see Ref. [12].)

Let 
$[A_{OR}(f, e_1, e_2 | \tilde f, d'_1, d'_2)]$ = [8 x 8 matrix with rows labeled by $(f, e_1, e_2)$ and columns
labeled by $(\tilde f, d'_1, d'_2)$, each label running over 000, 001, 010, 011, 100, 101, 110, 111] (23)




= [< / «f V ^]^+ [...]*>■ (24) 

For those familiar with Ref. [2], note that the probability amplitude $A_{OR}(f, e_1, e_2 | \tilde f, d'_1, d'_2)$
is a q-embedding of the probability distribution $P(f | d'_1, d'_2)$ defined in Eq. (4).
The matrix given by Eq. (23) can be compiled as follows:



[AoR{f,ei,e 2 \f,d' 1 ,d^] 



e «f <Tx®E(6,6/) eSoo ;2_ (0> o) P b,b> 
g i § <T X ®i4 g-J f 0"X ® -POO 



(25) 
(26) 
(27) 



= iajK2)H^2)] B(0)B(1) . 
The probability $P(f, e_1, e_2, \tilde d_1, \tilde d_2)$ for the quantum circuit of Fig. 3(b) is given by:



P(f,e 1 ,e 2 ,di,d 2 ) 



£ ^W/,ei,e 2 |/ = 0,^,4) II {^(^iK = ^ d j)\fP(dj 



In particular, when $f = 0$,

P(/ = O.ei.^dx,^) = II C f ip &)t% ■ 

3=1,2 

If $e_1$ and $e_2$ are not observed, we may sum over them to get



(2? 



(29) 



(30) 



$$P(f = 0, \tilde d_1, \tilde d_2) = \prod_{j=1,2} C_j^{2 \tilde d_j} P(\tilde d_j) . \tag{31}$$

If we replace $\tilde d_j$ by $d_j$, $P(f = 0, \tilde d_1, \tilde d_2)$ for the quantum circuit of Fig. 3(b) is identical
to $P(f = 0, d_1, d_2)$ for the CB net of Fig. 3(a). This is no coincidence. The quantum
circuit was designed from the CB net to make this true. In a sense defined in Ref. [2],
the CB net is embedded in the quantum circuit.
Figure 4: (a) QMR-like CB net with two diseases and three findings. (b) Quantum
circuit that generates some of the same probability distributions as the CB net of
Figure (a).



One can easily generalize this example with $N_D = 2$ and $N_F = 1$ to arbitrary
$N_D$ and $N_F$. Fig. 4 gives an example with $N_D = 2$ and $N_F = 3$. In the example with
$N_D = 2$, $N_F = 1$, we set:

$$[A_{OR}(f, e_1, e_2 | \tilde f, d'_1, d'_2)] = i\sigma_X(2) [-i\sigma_X(2)]^{\bar n(0) \bar n(1)} . \tag{32}$$

For arbitrary $N_D$, $N_F$, this equation can be generalized to:

$$[A_{OR}(f_i, \{e_j\}_{j \in pa(f_i)} | \tilde f_i, \{d'_j\}_{j \in pa(f_i)})] = i\sigma_X(\tau_i) [-i\sigma_X(\tau_i)]^{\prod_{\kappa \in K_i} \bar n(\kappa)} , \tag{33}$$

for $i \in Z_{1,N_F}$, where $\tau_i$ is the qubit label of qubit $f_i$, and $K_i$ is the set of qubit labels
for the parents of qubit $f_i$.

For arbitrary $N_D$, $N_F$, we can generalize this construction to obtain a quantum
circuit that yields probabilities $P(\vec f, \vec e, \vec{\tilde d})$. If the external outputs $\vec e$ are not observed,
then we measure $P(\vec f, \vec{\tilde d})$. If we replace $\vec{\tilde d}$ by $\vec d$, the probability $P(\vec f, \vec{\tilde d})$ for the quantum
circuit is identical to the probability $P(\vec f, \vec d)$ for the CB net that was embedded in
that quantum circuit. As discussed previously, the inference problem for the CB net
is to find $P[\vec d | (\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1]$. This probability equals $P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1, \vec d]$
divided by $P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1]$. The numerator $P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1, \vec d]$ can
be calculated exactly numerically on a conventional classical computer. Not so the
denominator $P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1]$, at least not for large nets. Here is where the
quantum computer shows its mettle. One can run the quantum circuit many times,
in either of two modes, to get a so-called empirical distribution that approximates
$P[\vec d | (\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1]$. The empirical distribution converges to the exact one. The
two modes that we are referring to are rejection sampling and likelihood weighted
sampling. We describe each of these separately in the next two sections.



4.1 Rejection Sampling 



Assume that we are given the number of samples $N_{sam}$ that we intend to collect,
and the sets $I_0, I_1, I_{unk}$ which are a disjoint partition of $Z_{1,N_F}$. Then the rejection
sampling algorithm goes as follows (expressed in pseudo-code, pidgin C language):

    For all \vec{d} { W(\vec{d}) = 0; }
    W_tot = 0;
    For samples k = 1, 2, ..., N_sam {
        Generate (\vec{\tilde d}^{(k)}, \vec{f}^{(k)}) with quantum computer;
        If (\vec{f}^{(k)})_{I_0} = 0 and (\vec{f}^{(k)})_{I_1} = 1 { //rejection here
            If \vec{\tilde d}^{(k)} = \vec{d} { W(\vec{d})++; }
            W_tot++;
        }
    } //k loop (samples)
    For all \vec{d} { P[\vec{d} | (\vec{f})_{I_0} = 0, (\vec{f})_{I_1} = 1] = W(\vec{d}) / W_tot; }



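A convergence proof of this algorithm goes as follows. For any function
$g : Bool^{N_D + N_F} \to \mathbb{R}$, as $N_{sam} \to \infty$, the sample average $\overline{g(\vec{\tilde d}^{(k)}, \vec f^{(k)})}$ tends to:

$$\overline{g(\vec{\tilde d}^{(k)}, \vec f^{(k)})} = \frac{1}{N_{sam}} \sum_k g(\vec{\tilde d}^{(k)}, \vec f^{(k)}) \to \sum_{\vec d', \vec f'} P(\vec d', \vec f') g(\vec d', \vec f') . \tag{34}$$

Therefore,

$$\frac{W(\vec d)}{W_{tot}} = \frac{\frac{1}{N_{sam}} \sum_k \delta(\vec d, \vec{\tilde d}^{(k)}) \delta[(\vec f^{(k)})_{I_0}, 0] \delta[(\vec f^{(k)})_{I_1}, 1]}{\frac{1}{N_{sam}} \sum_k \delta[(\vec f^{(k)})_{I_0}, 0] \delta[(\vec f^{(k)})_{I_1}, 1]} \tag{35}$$

$$\to \frac{P[\vec d, (\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1]}{P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1]} \tag{36}$$

$$= P[\vec d | (\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1] . \tag{37}$$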
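
The bookkeeping of the rejection sampling loop can be exercised without quantum hardware. In the Python sketch below, the function `sample_qmr` is a classical stand-in (an assumption made only for this illustration) for the step "Generate ... with quantum computer": it draws $(\vec d, \vec f)$ from the QMR distribution of Eq. (10) directly. The parameter layout `q[i][j]`, the names, and the numbers are hypothetical.

```python
import random

def sample_qmr(q, prior, rng):
    """Classical stand-in for one run of the circuit: returns one sample (d, f)."""
    d = [1 if rng.random() < p else 0 for p in prior]
    f = []
    for row in q:
        p_off = 1.0
        for qij, dj in zip(row, d):
            p_off *= (1.0 - qij) ** dj  # P(f_i = 0 | d), Eq. (10)
        f.append(0 if rng.random() < p_off else 1)
    return tuple(d), tuple(f)

def rejection_sampling(q, prior, I0, I1, n_sam, seed=0):
    """Estimate P[d | (f)_{I0}=0, (f)_{I1}=1] as W(d) / W_tot."""
    rng = random.Random(seed)
    W, W_tot = {}, 0
    for _ in range(n_sam):
        d, f = sample_qmr(q, prior, rng)
        # Rejection step: keep the sample only if it matches the evidence.
        if all(f[i] == 0 for i in I0) and all(f[i] == 1 for i in I1):
            W[d] = W.get(d, 0) + 1
            W_tot += 1
    return {d: w / W_tot for d, w in W.items()}

# Hypothetical toy net: 2 findings (rows of q), 2 diseases (columns).
estimate = rejection_sampling([[0.8, 0.3], [0.5, 0.6]], [0.3, 0.4],
                              I0=[0], I1=[1], n_sam=40000, seed=1)
```

For this toy net the exact posterior of $\vec d = (0, 1)$ is $0.1176 / 0.14904 \approx 0.789$, and only about 15% of the samples survive the rejection step, which illustrates why rejection sampling becomes wasteful when the evidence is improbable.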
4.2 Likelihood Weighted Sampling



We assume that all gates in the quantum circuit are elementary; that is, either
single-qubit transformations or controlled elementary gates (like CNOTs or
multiply-controlled NOTs or multiply-controlled phases). For likelihood weighted
sampling, the quantum circuit must be modified as follows:

1. For any qubit $f_i$ with $i \in I_1$, initialize the qubit to state $|1\rangle$. (For any qubit $f_i$
with $i \in I_0$, initialize the qubit to state $|0\rangle$, same as before.)

2. For any qubit $f_i$ with $i \in I_0 \cup I_1$, remove those elementary gates that can
change the state of $f_i$. In particular, remove any single-qubit gates acting on
$f_i$, and any controlled elementary gates that use $f_i$ as a target. Do not remove
controlled elementary gates that use $f_i$ as a control only.

Assume that we are given the number of samples $N_{sam}$ that we intend to
collect, and the sets $I_0, I_1, I_{unk}$ which are a disjoint partition of $Z_{1,N_F}$. Then the
likelihood weighted sampling algorithm goes as follows (expressed in pseudo-code,
pidgin C language):



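    For all \vec{d} { W(\vec{d}) = 0; }
    W_tot = 0;
    For samples k = 1, 2, ..., N_sam {
        Generate (\vec{\tilde d}^{(k)}, \vec{f}^{(k)}) subject to (\vec{f})_{I_0} = 0, (\vec{f})_{I_1} = 1 with quantum computer;
        L = \prod_{i \in I_0} P[f_i = 0 | (\vec{\tilde d}^{(k)})_{pa(f_i)}] \prod_{i \in I_1} P[f_i = 1 | (\vec{\tilde d}^{(k)})_{pa(f_i)}];
        If \vec{\tilde d}^{(k)} = \vec{d} { W(\vec{d}) += L; }
        W_tot += L;
    } //k loop (samples)
    For all \vec{d} { P[\vec{d} | (\vec{f})_{I_0} = 0, (\vec{f})_{I_1} = 1] = W(\vec{d}) / W_tot; }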



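A convergence proof of this algorithm goes as follows. Define the likelihood
functions $L_{evi}$ and $L_{unk}$ by ("evi" stands for evidence and "unk" for unknown):

$$L_{evi}(\vec d) = \prod_{i \in I_0} P[f_i = 0 | (\vec d)_{pa(f_i)}] \prod_{i \in I_1} P[f_i = 1 | (\vec d)_{pa(f_i)}] , \tag{38}$$

and

$$L_{unk}(\vec d, \vec f) = \prod_{i \in I_{unk}} P[f_i | (\vec d)_{pa(f_i)}] . \tag{39}$$

Clearly, when $(\vec f)_{I_0} = 0$ and $(\vec f)_{I_1} = 1$,

$$P(\vec d, \vec f) = L_{evi}(\vec d) L_{unk}(\vec d, \vec f) P(\vec d) . \tag{40}$$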
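For any function $g : Bool^{N_D + N_F} \to \mathbb{R}$, as $N_{sam} \to \infty$, the sample average $\overline{g(\vec{\tilde d}^{(k)}, \vec f^{(k)})}$
tends to:

$$\overline{g(\vec{\tilde d}^{(k)}, \vec f^{(k)})} = \frac{1}{N_{sam}} \sum_k g(\vec{\tilde d}^{(k)}, \vec f^{(k)}) \to \sum_{\vec d', \vec f'} \delta[(\vec f')_{I_0}, 0] \delta[(\vec f')_{I_1}, 1] L_{unk}(\vec d', \vec f') P(\vec d') g(\vec d', \vec f') . \tag{41}$$

Therefore,

$$\frac{W(\vec d)}{W_{tot}} = \frac{\frac{1}{N_{sam}} \sum_k L_{evi}(\vec{\tilde d}^{(k)}) \delta(\vec d, \vec{\tilde d}^{(k)})}{\frac{1}{N_{sam}} \sum_k L_{evi}(\vec{\tilde d}^{(k)})} \tag{42}$$

$$\to \frac{P[\vec d, (\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1]}{P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1]} \tag{43}$$

$$= P[\vec d | (\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1] . \tag{44}$$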
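
The weighting scheme can likewise be tried on a classical simulator. In the Python sketch below, the disease values are drawn classically from their priors as a stand-in (an assumption for illustration only) for running the modified circuit, and each sample is weighted by $L_{evi}$ of Eq. (38). The unknown findings are not sampled here because the weights $W(\vec d)$ do not depend on them. The parameter layout `q[i][j]`, the names, and the numbers are hypothetical.

```python
import random

def likelihood_weighting(q, prior, I0, I1, n_sam, seed=0):
    """Estimate P[d | (f)_{I0}=0, (f)_{I1}=1] as W(d) / W_tot with weights L_evi(d)."""
    rng = random.Random(seed)
    W, W_tot = {}, 0.0
    for _ in range(n_sam):
        # Stand-in for the circuit run: draw the diseases from their priors.
        d = tuple(1 if rng.random() < p else 0 for p in prior)
        L = 1.0
        for i in list(I0) + list(I1):
            p_off = 1.0
            for qij, dj in zip(q[i], d):
                p_off *= (1.0 - qij) ** dj  # P(f_i = 0 | d), Eq. (10)
            L *= p_off if i in I0 else 1.0 - p_off  # one factor of L_evi, Eq. (38)
        W[d] = W.get(d, 0.0) + L
        W_tot += L
    return {d: w / W_tot for d, w in W.items()}

# Same hypothetical toy net as one might use for rejection sampling.
lw_estimate = likelihood_weighting([[0.8, 0.3], [0.5, 0.6]], [0.3, 0.4],
                                   I0=[0], I1=[1], n_sam=40000, seed=1)
```

Unlike rejection sampling, every sample contributes to the estimate, at the price of carrying a weight per sample.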

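A Appendix: Summing $P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1, \vec d]$ over $\vec d$

In this appendix, we will sum $P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1, \vec d]$
over $\vec d$. This is like performing a multidimensional integral.
Recall that $P_{I_1,I_0}$ was defined as:

$$P_{I_1,I_0} = \sum_{\vec d} P[(\vec f)_{I_0} = 0, (\vec f)_{I_1} = 1, \vec d] . \tag{45}$$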

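For all $i \in Z_{1,N_F}$ and $j \in Z_{1,N_D}$, let

$$\theta_{ij} = \begin{cases} -\ln(1 - q_{ij}) & \text{if } j \in pa(f_i) \\ 0 & \text{otherwise} \end{cases} . \tag{46}$$

For all $j \in Z_{1,N_D}$ we can always find $\alpha_j, \beta_j \in \mathbb{R}$ so that $P(d_j)$ can be expressed as:

$$P(d_j) = e^{-\alpha_j - \beta_j d_j} . \tag{47}$$

Now $\Pi_0$ (defined by Eq. (13)), $\Pi_1$ (defined by Eq. (14)), and $P(\vec d)$ can be expressed as:

$$\Pi_0 = \prod_{i \in I_0} e^{-\vec\theta_i \cdot \vec d} , \tag{48}$$

$$\Pi_1 = \prod_{i \in I_1} \{1 - e^{-\vec\theta_i \cdot \vec d}\} , \tag{49}$$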
i'e/i 
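and

$$P(\vec d) = e^{-\alpha - \vec\beta \cdot \vec d} , \quad \text{where } \alpha = \sum_{j=1}^{N_D} \alpha_j . \tag{50}$$

Thus

$$P_{I_1,I_0} = \sum_{\vec d} \Pi_0 \Pi_1 P(\vec d) \tag{51}$$

$$= e^{-\alpha} \sum_{\vec d} e^{-(\vec\beta + \sum_{i \in I_0} \vec\theta_i) \cdot \vec d} \prod_{i \in I_1} \{1 - e^{-\vec\theta_i \cdot \vec d}\} . \tag{52}$$

Consider any set $\Omega$ and any function $f : \Omega \to \mathbb{R}$. When $\Omega = \{a, b\}$,

$$(1 - e^{-f(a)})(1 - e^{-f(b)}) = 1 - e^{-f(a)} - e^{-f(b)} + e^{-f(a) - f(b)} . \tag{53}$$

This generalizes to

$$\prod_{x \in \Omega} \{1 - e^{-f(x)}\} = \sum_{S \in 2^\Omega} (-1)^{|S|} e^{-\sum_{x \in S} f(x)} . \tag{54}$$

Using identity Eq. (54), Eq. (52) yields

$$P_{I_1,I_0} = e^{-\alpha} \sum_{S \subset I_1} (-1)^{|S|} \sum_{\vec d} e^{-(\vec\beta + \sum_{i \in I_0 \cup S} \vec\theta_i) \cdot \vec d} . \tag{55}$$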
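
The product-to-sum identity used in this derivation (a product of factors $1 - e^{-f(x)}$ rewritten as an alternating sum over subsets) is easy to check numerically. The short Python sketch below does so for a few arbitrary values of $f(x)$; the function names and the values in `F_VALUES` are hypothetical.

```python
from itertools import combinations
import math

def product_form(values):
    """Left hand side: product over x of (1 - exp(-f(x)))."""
    out = 1.0
    for v in values:
        out *= 1.0 - math.exp(-v)
    return out

def subset_sum_form(values):
    """Right hand side: sum over all subsets S of (-1)^{|S|} exp(-sum_{x in S} f(x))."""
    total = 0.0
    for r in range(len(values) + 1):
        for S in combinations(values, r):
            total += (-1) ** r * math.exp(-sum(S))
    return total

F_VALUES = [0.3, 1.2, 2.5]  # arbitrary values of f(x) on a 3-element set
```

Both functions return the same number up to rounding; the subset form has $2^{|\Omega|}$ terms, which is the origin of the $2^{|I_1|}$ cost discussed in the main text.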

For any $j \in Z_{1,N_D}$ and $A \subset Z_{1,N_F}$, define



Also define a function $t : \mathbb{R} \to \mathbb{R}$ by



d=0 

Using these definitions, Eq. (55) yields

$$P_{I_1,I_0} = \sum_{S \subset I_1} (-1)^{|S|} T_{S,I_0} .$$

$T$ is a function $T : 2^{I_1} \times 2^{I_0} \to \mathbb{R}$ defined by the last equation.
B Appendix: Mobius Inversion Theorem and Eq. (17)



In this Appendix, we discuss the application of the Mobius Inversion Theorem [13] to
Eq. (17).



Figure 5: The matrix $P_{S_1,S_0} = P[(\vec f)_{S_1} = 1, (\vec f)_{S_0} = 0]$ for all $S_1 \in 2^{I_1}$ and $S_0 \in 2^{I_0}$.



Fig. 5 shows the matrix $P_{S_1,S_0} = P[(\vec f)_{S_1} = 1, (\vec f)_{S_0} = 0]$ for all $S_1 \in 2^{I_1}$ and
$S_0 \in 2^{I_0}$, assuming large $|I_1|$ but arbitrary $|I_0|$. We label the rows and columns of
$P_{S_1,S_0}$ in order of increasing set size. The top-left corner entry is $P_{\emptyset,\emptyset} = 1$ and the
bottom-right corner entry is $P_{I_1,I_0}$. Note that $P_{S_1,S_0} \ge P_{I_1,I_0}$ for all $S_1 \subset I_1$, $S_0 \subset I_0$.
The shaded top part (corresponding to small or moderate $|S_1|$) of this matrix can be
calculated numerically with a classical computer. But not the unshaded bottom part
(corresponding to large $|S_1|$). An empirical approximation of the bottom part can be
obtained with a quantum computer.

Consider any set $J$ and any functions $f, g : 2^J \to \mathbb{C}$. The Mobius Inversion
Theorem [13] states that

$$g(J) = \sum_{J' \subset J} (-1)^{|J - J'|} f(J') \iff f(J) = \sum_{J' \subset J} g(J') . \tag{60}$$

Using the fact that when $J' \subset J$, $|J - J'| = |J| - |J'|$, and replacing $g(J)$ by $(-1)^{|J|} g(J)$
in the previous equation, we get

$$g(J) = \sum_{J' \subset J} (-1)^{|J'|} f(J') \iff f(J) = \sum_{J' \subset J} (-1)^{|J'|} g(J') . \tag{61}$$

We showed in Appendix A that

$$P_{I_1,I_0} = \sum_{S_1 \subset I_1} (-1)^{|S_1|} T_{S_1,I_0} . \tag{62}$$
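Thus, by virtue of Eq. (61),

$$T_{I_1,I_0} = \sum_{S_1 \subset I_1} (-1)^{|S_1|} P_{S_1,I_0} . \tag{63}$$

More generally, if $S'_1 \subset I_1$, $S_0 \subset I_0$, and

$$M_{S'_1,S_1} = (-1)^{|S_1|} , \tag{64}$$

then

$$P_{S'_1,S_0} = \sum_{S_1 \subset S'_1} M_{S'_1,S_1} T_{S_1,S_0} , \tag{65}$$

and

$$T_{S'_1,S_0} = \sum_{S_1 \subset S'_1} M_{S'_1,S_1} P_{S_1,S_0} . \tag{66}$$

Eq. (63) implies

$$P_{I_1,I_0} = (-1)^{|I_1|} \Big\{ T_{I_1,I_0} - \sum_{S_1 \subsetneq I_1} (-1)^{|S_1|} P_{S_1,I_0} \Big\} . \tag{67}$$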



To approximate $P_{I_1,I_0}$, one can estimate the right hand side of the last equation. $T_{I_1,I_0}$,
and $P_{S_1,I_0}$ for small and moderate $|S_1|$, can be calculated exactly numerically on a
classical computer. $P_{S_1,I_0}$ for large $|S_1|$ can be obtained empirically on a quantum
computer.
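
The inversion relations of this appendix can be verified on a toy net. The Python sketch below is an illustration under assumptions: the parameters `Q` and `PRIOR`, the layout `Q[i][j]` (zero when disease `j` is not a parent of finding `i`), and the closed form used for `T` are hypothetical choices that are consistent with Eq. (17). It checks that the alternating sum of the $P_{S_1,I_0}$ reproduces $T_{I_1,I_0}$, and that solving that relation for the last term recovers $P_{I_1,I_0}$.

```python
from itertools import combinations, product

Q = [[0.8, 0.0, 0.3], [0.5, 0.6, 0.0], [0.0, 0.7, 0.2]]  # hypothetical q_{ij}
PRIOR = [0.1, 0.2, 0.3]                                   # hypothetical P(d_j = 1)

def p_off(i, d):
    """P(f_i = 0 | d) for the noisy-or finding i."""
    out = 1.0
    for qij, dj in zip(Q[i], d):
        out *= (1.0 - qij) ** dj
    return out

def P(S1, S0):
    """P[(f)_{S1}=1, (f)_{S0}=0] by brute-force summation over all disease states."""
    total = 0.0
    for d in product((0, 1), repeat=len(PRIOR)):
        w = 1.0
        for pj, dj in zip(PRIOR, d):
            w *= pj if dj else 1.0 - pj
        for i in S0:
            w *= p_off(i, d)
        for i in S1:
            w *= 1.0 - p_off(i, d)
        total += w
    return total

def T(S1, S0):
    """Assumed closed form of T, factorized over diseases."""
    t = 1.0
    for j, pj in enumerate(PRIOR):
        off = 1.0
        for i in tuple(S0) + tuple(S1):
            off *= 1.0 - Q[i][j]
        t *= (1.0 - pj) + pj * off
    return t

def subsets(items):
    for r in range(len(items) + 1):
        yield from combinations(items, r)

I0, I1 = (0,), (1, 2)
# Alternating sum of P over all subsets S1 of I1; should equal T(I1, I0).
rhs_T = sum((-1) ** len(S1) * P(S1, I0) for S1 in subsets(I1))
# Same relation solved for the S1 = I1 term; should equal P(I1, I0).
proper = [S1 for S1 in subsets(I1) if len(S1) < len(I1)]
rhs_P = (-1) ** len(I1) * (T(I1, I0) - sum((-1) ** len(S1) * P(S1, I0) for S1 in proper))
```

In the scheme described above, the entries `P(S1, I0)` with large `S1` would come from empirical estimates rather than from the brute-force function used here.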



C Appendix: Rejection Sampling for CB Nets on a Classical Computer

In this Appendix, we review the rejection sampling algorithm for arbitrary CB nets
on a classical computer.

Consider a CB net whose nodes are labeled in topological order by $x_1, x_2, \ldots, x_{N_{nds}}$,
collectively denoted by $x$. Assume that $E$ (evidence set) and $H$ (hypotheses set) are
disjoint subsets of $Z_{1,N_{nds}}$, with $Z_{1,N_{nds}} - E \cup H$ not necessarily empty. Assume that we
are given the number of samples $N_{sam}$ that we intend to collect, and the prior evidence
$(x)_E$. Then the rejection sampling algorithm goes as follows (expressed in pseudo-code,
pidgin C language):
    For all (x)_H { W[(x)_H] = 0; }
    W_tot = 0;
    For samples k = 1, 2, ..., N_sam {
        For nodes i = 1, 2, ..., N_nds {
            Generate x_i^{(k)} from P[x_i | (x^{(k)})_{pa(x_i)}];
            //Note that pa(x_i) \subset Z_{1,i-1} so
            //(x^{(k)})_{pa(x_i)} has been calculated at this point
        } //i loop (nodes)
        If (x^{(k)})_E = (x)_E { //rejection here
            If (x^{(k)})_H = (x)_H { W[(x)_H]++; }
            W_tot++;
        }
    } //k loop (samples)
    For all (x)_H { P[(x)_H | (x)_E] = W[(x)_H] / W_tot; }



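A convergence proof of this algorithm goes as follows. For any function
$g : val(x) \to \mathbb{R}$, as $N_{sam} \to \infty$, the sample average $\overline{g(x^{(k)})}$ tends to:

$$\overline{g(x^{(k)})} = \frac{1}{N_{sam}} \sum_k g(x^{(k)}) \to \sum_{x'} P(x') g(x') . \tag{68}$$

Therefore,

$$\frac{W[(x)_H]}{W_{tot}} = \frac{\frac{1}{N_{sam}} \sum_k \delta[(x)_{E \cup H}, (x^{(k)})_{E \cup H}]}{\frac{1}{N_{sam}} \sum_k \delta[(x)_E, (x^{(k)})_E]} \tag{69}$$

$$\to \frac{\sum_{x'} P(x') \delta[(x)_{E \cup H}, (x')_{E \cup H}]}{\sum_{x'} P(x') \delta[(x)_E, (x')_E]} \tag{70}$$

$$= \frac{P[(x)_{E \cup H}]}{P[(x)_E]} . \tag{71}$$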
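
The pseudo-code of this appendix can be turned into a short runnable program. The Python sketch below is one possible rendering under assumptions: the three-node net `NET`, its conditional probabilities, and the data layout (each node given as a tuple of parent indices plus a table mapping parent values to the probability that the node equals 1) are hypothetical and chosen only to keep the example small.

```python
import random

# Hypothetical 3-node binary net in topological order: x0 -> x1, (x0, x1) -> x2.
# Each entry is (parents, table), where table[parent_values] = P(x_i = 1 | parents).
NET = [
    ((), {(): 0.3}),
    ((0,), {(0,): 0.2, (1,): 0.7}),
    ((0, 1), {(0, 0): 0.1, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.9}),
]

def forward_sample(net, rng):
    """Sample every node in topological order, so parents are always already set."""
    x = []
    for parents, table in net:
        p1 = table[tuple(x[p] for p in parents)]
        x.append(1 if rng.random() < p1 else 0)
    return x

def rejection_sampling(net, evidence, H, n_sam, seed=0):
    """Estimate P[(x)_H | (x)_E]; evidence maps node index -> observed value."""
    rng = random.Random(seed)
    W, W_tot = {}, 0
    for _ in range(n_sam):
        x = forward_sample(net, rng)
        if all(x[i] == v for i, v in evidence.items()):  # rejection here
            key = tuple(x[i] for i in H)
            W[key] = W.get(key, 0) + 1
            W_tot += 1
    return {key: w / W_tot for key, w in W.items()}

rs_estimate = rejection_sampling(NET, evidence={2: 1}, H=[0], n_sam=40000, seed=2)
```

For this toy net the exact answer is $P(x_0 = 1 | x_2 = 1) = 0.234 / 0.374 \approx 0.626$, against which the estimate can be compared.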



D Appendix: Likelihood Weighted Sampling for CB Nets on a Classical Computer

In this Appendix, we review the likelihood weighted sampling algorithm for arbitrary
CB nets on a classical computer [14, 15].

Consider a CB net whose nodes are labeled in topological order by $x_1, x_2, \ldots, x_{N_{nds}}$,
collectively denoted by $x$. Assume that $E$ (evidence set) and $H$ (hypotheses set) are
disjoint subsets of $Z_{1,N_{nds}}$, with $Z_{1,N_{nds}} - E \cup H$ not necessarily empty. Let
$X^c = Z_{1,N_{nds}} - X$ for any $X \subset Z_{1,N_{nds}}$.
Assume that we are given the number of samples $N_{sam}$ that we intend to collect, and
the prior evidence $(x)_E$. Then the likelihood weighted sampling algorithm goes as
follows (expressed in pseudo-code, pidgin C language):



    For all (x)_H { W[(x)_H] = 0; }
    W_tot = 0;
    For samples k = 1, 2, ..., N_sam {
        L = 1;
        For nodes i = 1, 2, ..., N_nds {
            If i \in E^c {
                Generate x_i^{(k)} from P[x_i | (x^{(k)})_{pa(x_i)}];
                //Note that pa(x_i) \subset Z_{1,i-1} so
                //(x^{(k)})_{pa(x_i)} has been calculated at this point
            } else if i \in E {
                x_i^{(k)} = x_i; //(x)_E known
                L *= P[x_i | (x^{(k)})_{pa(x_i)}];
            }
        } //i loop (nodes)
        If (x^{(k)})_H = (x)_H { W[(x)_H] += L; }
        W_tot += L;
    } //k loop (samples)
    For all (x)_H { P[(x)_H | (x)_E] = W[(x)_H] / W_tot; }



A convergence proof of this algorithm goes as follows. Define the likelihood
function:

$$L_A(x) = \prod_{i \in A} P[x_i | (x)_{pa(x_i)}] \tag{72}$$

for any $A \subset Z_{1,N_{nds}}$. Clearly,

$$P(x) = L_E(x) L_{E^c}(x) . \tag{73}$$

For any function $g : val(x) \to \mathbb{R}$, as $N_{sam} \to \infty$, the sample average $\overline{g(x^{(k)})}$ tends to:

$$\overline{g(x^{(k)})} = \frac{1}{N_{sam}} \sum_k g(x^{(k)}) \to \sum_{x'} L_{E^c}(x') \delta[(x)_E, (x')_E] g(x') . \tag{74}$$

Therefore,

$$\frac{W[(x)_H]}{W_{tot}} = \frac{\frac{1}{N_{sam}} \sum_k L_E(x^{(k)}) \delta[(x)_H, (x^{(k)})_H]}{\frac{1}{N_{sam}} \sum_k L_E(x^{(k)})} \tag{75}$$

$$\to \frac{\sum_{x'} P(x') \delta[(x)_{E \cup H}, (x')_{E \cup H}]}{\sum_{x'} P(x') \delta[(x)_E, (x')_E]} \tag{76}$$

$$= \frac{P[(x)_{E \cup H}]}{P[(x)_E]} . \tag{77}$$
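
For comparison with rejection sampling, the likelihood weighted algorithm of this appendix can be sketched in Python as follows. The three-node net `LW_NET`, its probabilities, and the data layout (parent indices plus a table giving the probability that the node equals 1) are hypothetical, chosen only for illustration. Evidence nodes are clamped to their observed values and contribute a factor to the weight `L` instead of being sampled.

```python
import random

# Hypothetical 3-node binary net in topological order: x0 -> x1, (x0, x1) -> x2.
# Each entry is (parents, table), where table[parent_values] = P(x_i = 1 | parents).
LW_NET = [
    ((), {(): 0.3}),
    ((0,), {(0,): 0.2, (1,): 0.7}),
    ((0, 1), {(0, 0): 0.1, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.9}),
]

def likelihood_weighting(net, evidence, H, n_sam, seed=0):
    """Estimate P[(x)_H | (x)_E]; evidence maps node index -> observed value."""
    rng = random.Random(seed)
    W, W_tot = {}, 0.0
    for _ in range(n_sam):
        x, L = [], 1.0
        for i, (parents, table) in enumerate(net):
            p1 = table[tuple(x[p] for p in parents)]
            if i in evidence:
                x.append(evidence[i])                      # (x)_E known
                L *= p1 if evidence[i] == 1 else 1.0 - p1  # weight by P[x_i | parents]
            else:
                x.append(1 if rng.random() < p1 else 0)    # sample non-evidence node
        key = tuple(x[i] for i in H)
        W[key] = W.get(key, 0.0) + L
        W_tot += L
    return {key: w / W_tot for key, w in W.items()}

lw_net_estimate = likelihood_weighting(LW_NET, evidence={2: 1}, H=[0], n_sam=40000, seed=3)
```

For this toy net the exact answer is $P(x_0 = 1 | x_2 = 1) = 0.234 / 0.374 \approx 0.626$; no sample is discarded, in contrast with rejection sampling.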